245 research outputs found

    Confidence Intervals for Heritability via Haseman-Elston Regression

    Heritability is the proportion of phenotypic variance in a population that is attributable to individual genotypes. Heritability is considered an important measure in both evolutionary biology and medicine, and is routinely estimated and reported in genetic epidemiology studies. In population-based genome-wide association studies (GWAS), mixed models are used to estimate variance components, from which a heritability estimate is obtained. The estimated heritability is the proportion of the model's total variance that is due to the genetic relatedness matrix (kinship measured from genotypes). Current practice is to use bootstrapping, which is slow, or a normal asymptotic approximation to estimate the precision of the heritability estimate; however, this approximation fails to hold near the boundaries of the parameter space or when the sample size is small. In this paper we propose to estimate variance components via a Haseman-Elston regression, find the asymptotic distribution of the variance components and proportions of variance, and use them to construct confidence intervals (CIs). Our method is further developed to estimate unbiased variance components and construct CIs by meta-analyzing information from multiple studies. We demonstrate our approach on data from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).
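
As a rough illustration of the idea behind Haseman-Elston regression (not the authors' estimator, which handles multiple variance components and confidence intervals), the sketch below regresses products of standardized phenotypes for each pair of individuals on their kinship coefficient. The function name `haseman_elston`, the single-component model, and the no-intercept fit are assumptions made for this example.

```python
import numpy as np

def haseman_elston(y, K):
    """Single-component Haseman-Elston sketch (illustrative assumption).

    y : phenotype vector of length n
    K : n x n genetic relatedness (kinship) matrix
    Returns the no-intercept least-squares slope of z_i * z_j on K_ij
    over all pairs i < j, where z is the standardized phenotype.
    """
    y = np.asarray(y, dtype=float)
    K = np.asarray(K, dtype=float)
    z = (y - y.mean()) / y.std()          # center and scale the phenotype
    iu = np.triu_indices(len(z), k=1)     # index pairs with i < j
    products = np.outer(z, z)[iu]         # cross-products z_i * z_j
    kin = K[iu]                           # matching off-diagonal kinship values
    return float(kin @ products / (kin @ kin))

if __name__ == "__main__":
    # Simulated data, for illustration only.
    rng = np.random.default_rng(0)
    n, m, h2_true = 300, 500, 0.5
    G = rng.binomial(2, 0.5, size=(n, m)).astype(float)
    G = (G - G.mean(axis=0)) / G.std(axis=0)   # standardize each SNP
    K = G @ G.T / m                             # genotype-based kinship
    beta = rng.normal(0.0, np.sqrt(h2_true / m), size=m)
    y = G @ beta + rng.normal(0.0, np.sqrt(1.0 - h2_true), size=n)
    print(haseman_elston(y, K))
```

With a few hundred unrelated individuals the point estimate is noisy, which is part of why the precision of such estimates needs careful treatment.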

    Control Function Assisted IPW Estimation with a Secondary Outcome in Case-Control Studies

    Case-control studies are designed to study associations between risk factors and a single, primary outcome. Information about additional, secondary outcomes is also collected, but association studies targeting such secondary outcomes should account for the case-control sampling scheme, or otherwise results may be biased. Often, one uses inverse probability weighted (IPW) estimators to estimate population effects in such studies. However, these estimators are inefficient relative to estimators that make additional assumptions about the data generating mechanism. We propose a class of estimators for the effect of risk factors on a secondary outcome in case-control studies, when the mean is modeled using either the identity or the log link. The proposed estimator combines IPW with a mean zero control function that depends explicitly on a model for the primary disease outcome. The efficient estimator in our class of estimators reduces to standard IPW when the model for the primary disease outcome is unrestricted, and is more efficient than standard IPW when the model is either parametric or semiparametric.
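
The standard IPW baseline mentioned above can be sketched as a weighted least-squares fit with the identity link, where each subject is weighted by the inverse of their probability of being sampled into the study. This is only the baseline, not the proposed control-function estimator; the function name `ipw_linear_fit` and the assumption that sampling probabilities are known are introduced here for illustration.

```python
import numpy as np

def ipw_linear_fit(x, y, sampling_prob):
    """Standard IPW estimate for a linear mean model (illustrative sketch).

    x             : 1-D risk factor
    y             : secondary outcome
    sampling_prob : each subject's probability of selection into the
                    case-control sample (assumed known here)
    Returns [intercept, slope] from inverse-probability-weighted least squares.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sampling_prob, dtype=float)   # IPW weights
    X = np.column_stack([np.ones_like(x), x])          # design matrix with intercept
    sw = np.sqrt(w)
    # Weighted least squares via rescaling rows by sqrt(weight).
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef
```

Controls, who are typically sampled with much lower probability than cases, receive larger weights, so the weighted sample resembles the source population.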

    On Negative Outcome Control of Unobserved Confounding as a Generalization of Difference-in-Differences

    The difference-in-differences (DID) approach is a well-known strategy for estimating the effect of an exposure in the presence of unobserved confounding. The approach is most commonly used when pre- and post-exposure outcome measurements are available, and one can assume that the association of the unobserved confounder with the outcome is equal in the two exposure groups, and constant over time. Then, one recovers the treatment effect by regressing the change in outcome over time on the exposure. In this paper, we interpret the difference-in-differences as a negative outcome control (NOC) approach. We show that the pre-exposure outcome is a negative control outcome, as it cannot be influenced by the subsequent exposure, and it is affected by both observed and unobserved confounders of the exposure-outcome association of interest. The relation between DID and NOC provides simple conditions under which negative control outcomes can be used to detect and correct for confounding bias. However, for general negative control outcomes, the DID-like assumption may be overly restrictive and rarely credible, because it requires that both the outcome of interest and the control outcome are measured on the same scale. Thus, we present a scale-invariant generalization of the DID that may be used in broader NOC contexts. The proposed approach is demonstrated in simulations and on a Normative Aging Study data set, in which Body Mass Index is used for NOC of the relationship between air pollution and inflammatory outcomes.
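
The classical DID step described above (regressing the change in outcome on the exposure) can be sketched as follows. This illustrates only the standard DID estimate, not the scale-invariant generalization the abstract proposes; the function name `did_estimate` and the toy numbers in the usage note are hypothetical.

```python
import numpy as np

def did_estimate(y_pre, y_post, exposed):
    """Classical difference-in-differences estimate (illustrative sketch).

    Regresses the within-subject change (y_post - y_pre) on the exposure
    indicator and returns the slope. With a binary exposure this equals
    the mean change among the exposed minus the mean change among the
    unexposed.
    """
    delta = np.asarray(y_post, dtype=float) - np.asarray(y_pre, dtype=float)
    a = np.asarray(exposed, dtype=float)
    X = np.column_stack([np.ones_like(a), a])   # intercept + exposure
    coef, *_ = np.linalg.lstsq(X, delta, rcond=None)
    return float(coef[1])
```

For example, with pre-exposure outcomes `[1, 2, 3, 4]`, post-exposure outcomes `[2, 3, 6, 7]`, and exposure `[0, 0, 1, 1]`, the changes are 1 in the unexposed and 3 in the exposed, so the estimate is 2.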

    Relationship between margin distance and local recurrence among patients undergoing wedge resection for small (≤2 cm) non–small cell lung cancer

    Objective: Successful pulmonary wedge resection for early-stage non–small cell lung cancer requires a pathologically confirmed negative margin. To date, however, no clear evidence is available regarding whether an optimal margin distance, defined as the distance from the primary tumor to the closest resection margin, exists. Toward addressing this gap, we investigated the relationship between the margin distance and local recurrence risk.
    Methods: We reviewed all adult patients who had undergone wedge resection for small (≤2 cm) non–small cell lung cancer from January 2001 to August 2011, with follow-up through December 31, 2011. The exclusion criteria included other active noncutaneous malignancies, bronchoalveolar carcinomas, lymph node or distant metastases at diagnosis, large cell cancer, adenosquamous cancer, multiple, multifocal, and/or metastatic disease, and previous chemotherapy or radiotherapy. Using Cox regression analysis, we examined the relationship between the margin distance and interval to local recurrence, adjusting for chronic obstructive pulmonary disease, forced expiratory volume in 1 second, smoking, diabetes, tumor size, tumor location, surgeon, open versus video-assisted thoracoscopic surgery, and whether the lymph nodes were sampled.
    Results: Of 557 consecutive adult patients, 479 met our inclusion criteria. The overall, unadjusted 1- and 2-year local recurrence rates were 5.7% and 11.0%, respectively. From the adjusted analyses, an increased margin distance was significantly associated with a lower risk of local recurrence (P = .033). Patients with a 10-mm margin distance had a 45% lower local recurrence risk than those with a 5-mm distance (hazard ratio, 0.55; 95% confidence interval, 0.35-0.86). Beyond 15 mm, no evidence of additional benefit was associated with an increased margin distance.
    Conclusions: In wedge resection for small non–small cell lung cancer, increasing the margin distance up to 15 mm significantly decreased the local recurrence risk, with no evidence of additional benefit beyond 15 mm.

    A Powerful Statistical Framework for Generalization Testing in GWAS, with Application to the HCHS/SOL

    In GWAS, “generalization” is the replication of genotype-phenotype association in a population with different ancestry than the population in which it was first identified. The standard for reporting findings from a GWAS requires a two-stage design, in which discovered associations are replicated in an independent follow-up study. Current practices for declaring generalizations rely on testing associations while controlling the Family Wise Error Rate (FWER) in the discovery study, then separately controlling error measures in the follow-up study. While this approach limits false generalizations, we show that it does not guarantee control over the FWER or False Discovery Rate (FDR) of the generalization null hypotheses. In addition, it fails to leverage the two-stage design to increase power for detecting generalized associations. We develop a formal statistical framework for quantifying the evidence of generalization that accounts for the (in)consistency between the directions of associations in the discovery and follow-up studies. We develop the directional generalization FWER (FWERg) and FDR (FDRg) controlling r-values, which are used to declare associations as generalized. This framework extends to generalization testing when applied to a published list of SNP-trait associations. We show that our framework accommodates various SNP selection rules for generalization testing based on p-values in the discovery study, and still controls FWERg or FDRg. A key finding is that it is often beneficial to use a more lenient p-value threshold than the genome-wide significance threshold. For instance, in a GWAS of Total Cholesterol (TC) in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), when testing all SNPs with p-values < 5 × 10⁻⁸ (15 genomic regions) for generalization in a large GWAS of whites, we generalized SNPs from 15 regions. But when testing all SNPs with p-values < 6.6 × 10⁻⁵ (89 regions), we generalized SNPs from 27 regions.